LRTwiki: Enriching the Likelihood Ratio Test with Encyclopedic Information for the Extraction of Relevant Terms
Authors
Abstract
This paper introduces LRTwiki, an improved variant of the Likelihood Ratio Test (LRT). The central idea of LRTwiki is to employ a comprehensive, domain-specific knowledge source as additional "on-topic" data, and to modify the LRT calculation to take advantage of this new information. The knowledge source is built from Wikipedia articles. We evaluate on two related tasks, product feature extraction and keyphrase extraction, and find that LRTwiki yields a significant improvement over the original LRT in both.
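The abstract does not spell out the LRT calculation or how LRTwiki modifies it. As background only, the sketch below shows one common formulation of the baseline test for term extraction: a binomial log-likelihood ratio comparing a term's frequency in an on-topic corpus against an off-topic corpus. The function names and the example counts are assumptions made for illustration and are not taken from the paper.

```python
import math


def _binomial_log_likelihood(p, k, n):
    """Return k*log(p) + (n-k)*log(1-p), treating 0*log(0) as 0."""
    total = 0.0
    if k > 0:
        total += k * math.log(p)
    if n - k > 0:
        total += (n - k) * math.log(1.0 - p)
    return total


def log_likelihood_ratio(k_on, n_on, k_off, n_off):
    """Return the -2 log(lambda) statistic for one term.

    k_on / n_on:   term count and total token count in the on-topic corpus
    k_off / n_off: term count and total token count in the off-topic corpus
    Both token counts must be positive.
    """
    p_on = k_on / n_on
    p_off = k_off / n_off
    p_pooled = (k_on + k_off) / (n_on + n_off)
    return 2.0 * (
        _binomial_log_likelihood(p_on, k_on, n_on)
        + _binomial_log_likelihood(p_off, k_off, n_off)
        - _binomial_log_likelihood(p_pooled, k_on, n_on)
        - _binomial_log_likelihood(p_pooled, k_off, n_off)
    )


def is_candidate_term(k_on, n_on, k_off, n_off, threshold):
    """Keep a term only if it is relatively more frequent on-topic
    and its statistic exceeds a caller-chosen threshold."""
    more_frequent_on_topic = k_on / n_on > k_off / n_off
    return more_frequent_on_topic and (
        log_likelihood_ratio(k_on, n_on, k_off, n_off) > threshold
    )


if __name__ == "__main__":
    # Invented counts: a term seen 50 times in 1000 on-topic tokens
    # and 5 times in 1000 off-topic tokens.
    print(log_likelihood_ratio(50, 1000, 5, 1000))
```

The statistic is symmetric in the two corpora, so the direction check in `is_candidate_term` is what restricts the result to terms that are over-represented in the on-topic data. The threshold is left to the caller because the abstract gives no value.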
Similar resources
Modified signed log-likelihood test for the coefficient of variation of an inverse Gaussian population
In this paper, we consider the problem of two sided hypothesis testing for the parameter of coefficient of variation of an inverse Gaussian population. An approach used here is the modified signed log-likelihood ratio (MSLR) method which is the modification of traditional signed log-likelihood ratio test. Previous works show that this proposed method has third-order accuracy whereas the traditi...
Comparison of Local and Non-Local Methods in Covariance Matrix Estimation by Using Multi-baseline SAR Interferometry and Height Extraction for Principal Components with Maximum Likelihood Approach
To date, synthetic aperture radar (SAR) interferometry (InSAR) has been widely exploited in digital elevation model (DEM) generation and deformation mapping. The conventional InSAR technique exploits two SAR images acquired from slightly different angles, in which the information of elevation and deformation can be captured through processing of the phase difference of the image...
Mapping the Potential of Groundwater Resources in Hard Formations Using Geographic Information System and Remote Sensing, Case Study: Northwest of Shahroud
In recent years, rapid population growth has increased per capita water use in various sectors, including agriculture and industry, and a growing gap between water demand and water supply has emerged. Therefore, identifying and tracking changes in groundwater resources as an alternative and reliable source of surface water resources are so important to region located in the Middle East with...
Pitman-Closeness of Preliminary Test and Some Classical Estimators Based on Records from Two-Parameter Exponential Distribution
In this paper, we study the performance of estimators of the parameters of the two-parameter exponential distribution based on upper records. The generalized likelihood ratio (GLR) test was used to generate a preliminary test estimator (PTE) for both parameters. We have compared the proposed estimator with maximum likelihood (ML) and unbiased estimators (UE) under mean-squared error (MSE) and Pitman me...
Enriching Ontologies with Encyclopedic Background Knowledge for Document Indexing
The rapidly increasing number of scientific documents available publicly on the Internet creates the challenge of efficiently organizing and indexing these documents. Due to the time consuming and tedious nature of manual classification and indexing, there is a need for better methods to automate this process. This thesis proposes an approach which leverages encyclopedic background knowledge fo...
Publication date: 2009